Global Thresholding and Multiple Pass Parsing

Author: Goodman Joshua
Publication date: 01/01/1997

We present a variation on classic beam thresholding techniques that is up to an order of magnitude faster than the traditional method, at the same performance level. We also present a new thresholding technique, global thresholding, which, combined with the new beam thresholding, gives an additional factor of two improvement, and a novel technique, multiple pass parsing, that can be combined with the others to yield yet another 50% improvement. We use a new search algorithm to simultaneously optimize the thresholding parameters of the various algorithms.

Comment: Fixed LaTeX errors; fixed minor errors in published version

arXiv.org e-Print Archive

CiteSeerX

Efficient Algorithms for Parsing the DOP Model

Author: Goodman Joshua
Publication date: 01/01/1996

Excellent results have been reported for Data-Oriented Parsing (DOP) of natural language texts (Bod, 1993). Unfortunately, existing algorithms are both computationally intensive and difficult to implement. Previous algorithms are expensive due to two factors: the exponential number of rules that must be generated and the use of a Monte Carlo parsing algorithm. In this paper we solve the first problem by a novel reduction of the DOP model to a small, equivalent probabilistic context-free grammar. We solve the second problem by a novel deterministic parsing strategy that maximizes the expected number of correct constituents, rather than the probability of a correct parse tree. Using the optimizations, experiments yield a 97% crossing brackets rate and 88% zero crossing brackets rate. This differs significantly from the results reported by Bod, and is comparable to results from a duplication of Pereira and Schabes's (1992) experiment on the same data. We show that Bod's results are at least partially due to an extremely fortuitous choice of test data, and partially due to using cleaner data than other researchers.

Comment: 10 pages

arXiv.org e-Print Archive

CiteSeerX

Skills, Schools, and Credit Constraints

Author: Goodman Joshua
Publication venue: Columbia University Libraries/Information Services
Publication date: 01/01/2008

Low college enrollment rates among low income students may stem from credit constraints, low academic skill, low quality schools, or some combination of these. Recent Massachusetts data allow the first use of school district fixed effects in the analysis of credit constraints, leading to four primary findings. First, Massachusetts' low income students have lower intended college enrollment rates than higher income students but also have dramatically lower skills and attend lower quality school districts. Second, inclusion of skill controls greatly reduces but does not eliminate the intended enrollment gap, with low income students seven percentage points less likely to intend enrollment than similarly skilled higher income students. Third, in districts where higher income students are plausibly unconstrained, inclusion of school district fixed effects does little to reduce intended enrollment gaps, with low income students nine percentage points less likely to intend enrollment than similarly skilled higher income students from the same school district. Fourth, low income students in the middle and upper parts of the skill distribution appear the most constrained, particularly with respect to four-year public colleges. State governments could use the methods employed here to identify credit constrained student populations in order to target financial aid more efficiently.

Columbia University Academic Commons

TamPub Julkaisuarkisto - TamPub Institutional Repository

Trepo - Institutional Repository of Tampere University

Parsing Inside-Out

Author: Goodman Joshua
Publication date: 01/01/1998

The inside-outside probabilities are typically used for reestimating Probabilistic Context Free Grammars (PCFGs), just as the forward-backward probabilities are typically used for reestimating HMMs. I show several novel uses, including improving parser accuracy by matching parsing algorithms to evaluation criteria; speeding up DOP parsing by 500 times; and 30 times faster PCFG thresholding at a given accuracy level. I also give an elegant, state-of-the-art grammar formalism, which can be used to compute inside-outside probabilities; and a parser description formalism, which makes it easy to derive inside-outside formulas and many others.

Comment: Ph.D. Thesis, 257 pages, 40 postscript figures

arXiv.org e-Print Archive

CiteSeerX

The Wages of Sinistrality: Handedness, Brain Structure and Human Capital Accumulation

Author: Goodman Joshua Samuel
Publication venue: John F. Kennedy School of Government, Harvard University
Publication date: 18/01/2012

Left- and right-handed individuals have different brain structures, particularly in relation to language processing. Using five data sets from the US and UK, I show that poor infant health increases the likelihood of a child being left-handed. I argue that handedness can thus be used to explore the long-run impacts of differential brain structure generated in part by poor infant health. Even conditional on infant health and family background, lefties exhibit economically and statistically significant human capital deficits relative to righties. Compared to righties, lefties score a tenth of a standard deviation lower on measures of cognitive skill and, contrary to popular wisdom, are not over-represented at the high end of the distribution. Lefties have more emotional and behavioral problems, have more learning disabilities such as dyslexia, complete less schooling, and work in less cognitively intensive occupations. Differences between left- and right-handed siblings are similar in magnitude. Most strikingly, lefties have six percent lower annual earnings than righties, a gap that can largely be explained by these differences in cognitive skill, disabilities, schooling and occupational choice. Lefties work in more manually intensive occupations than do righties, further suggesting that lefties’ primary labor market disadvantage is cognitive rather than physical. Those likely to be left-handed due to genetics show smaller or no deficits relative to righties, suggesting the importance of environmental shocks as the source of disadvantage. Handedness provides parents and schools a costlessly observable characteristic with which to identify young children whose cognitive and behavioral development may warrant additional attention.

Harvard University - DASH

Gold Standards?: State Standards Reform and Student Achievement

Author: Goodman Joshua Samuel
Publication venue: John F. Kennedy School of Government, Harvard University
Publication date: 06/08/2012

Proponents of the recent and widely adopted Common Core State Standards argue that high quality curricular standards are critical to students’ educational success. Little clear evidence exists, however, linking the quality of such standards to student achievement. I remedy this by connecting data on state-level student achievement from 1994-2011 with measures of the quality of states’ curricular standards as judged by two independent organizations at three different moments in time. I show that, within states, changes in the quality of standards have little impact on overall student achievement. Improved standards do, however, raise achievement of 8th graders in low-scoring states, particularly for low-scoring students. Given the known weaknesses of U.S. middle schools, this result suggests that standards may be beneficial in settings where pedagogy would otherwise be poor.

Harvard University - DASH